Skip to content

Conversation

@jimfcarroll
Copy link

The Spark ParquetRelation.scala code makes the assumption that the parquet.Log class has already been loaded. If ParquetRelation.enableLogForwarding executes prior to the parquet.Log class being loaded then the code in enableLogForwarding has no affect.

ParquetRelation.scala attempts to override the parquet logger but, at least currently (and if your application simply reads a parquet file before it does anything else with Parquet), the parquet.Log class hasn't been loaded yet. Therefore the code in ParquetRelation.enableLogForwarding has no affect. If you look at the code in parquet.Log there's a static initializer that needs to be called prior to enableLogForwarding or whatever enableLogForwarding does gets undone by this static initializer.

The "fix" would be to force the static initializer to get called in parquet.Log as part of enableForwardLogging.

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@marmbrus
Copy link
Contributor

ok to test

@SparkQA
Copy link

SparkQA commented Nov 14, 2014

Test build #23388 has started for PR 3271 at commit 37bdff7.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Nov 14, 2014

Test build #23388 has finished for PR 3271 at commit 37bdff7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23388/
Test PASSed.

@pwendell
Copy link
Contributor

LGTM - good catch on this one

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bonus points if you can file a JIRA with them and follow up on this :) They are generally pretty open/responsive.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I posted a suggestion to their dev-list. We'll see where it goes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks!

@marmbrus
Copy link
Contributor

Thanks a lot for looking into this and for the fast turn around!

@marmbrus
Copy link
Contributor

Merged to master and 1.2.

asfgit pushed a commit that referenced this pull request Nov 15, 2014
The Spark ParquetRelation.scala code makes the assumption that the parquet.Log class has already been loaded. If ParquetRelation.enableLogForwarding executes prior to the parquet.Log class being loaded then the code in enableLogForwarding has no affect.

ParquetRelation.scala attempts to override the parquet logger but, at least currently (and if your application simply reads a parquet file before it does anything else with Parquet), the parquet.Log class hasn't been loaded yet. Therefore the code in ParquetRelation.enableLogForwarding has no affect. If you look at the code in parquet.Log there's a static initializer that needs to be called prior to enableLogForwarding or whatever enableLogForwarding does gets undone by this static initializer.

The "fix" would be to force the static initializer to get called in parquet.Log as part of enableForwardLogging.

Author: Jim Carroll <jim@dontcallme.com>

Closes #3271 from jimfcarroll/parquet-logging and squashes the following commits:

37bdff7 [Jim Carroll] Fix Spark's control of Parquet logging.

(cherry picked from commit 37482ce)
Signed-off-by: Michael Armbrust <michael@databricks.com>
@asfgit asfgit closed this in 37482ce Nov 15, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants